1. Introduction.

Let $\Pi_1, \ldots, \Pi_m$ be populations whose multivariate observations in $\mathbb{R}^n$ are distributed with respective normal density functions

$$p_i(x) = \frac{1}{(2\pi)^{n/2}\,|\Sigma_i^0|^{1/2}} \exp\left[-\tfrac{1}{2}\,(x-\mu_i^0)^T (\Sigma_i^0)^{-1} (x-\mu_i^0)\right], \qquad i = 1, \ldots, m.$$

If $\Pi_0$ is a given mixture of members of these populations, then observations on $\Pi_0$ are distributed in $\mathbb{R}^n$ with density function

$$p(x) = \sum_{i=1}^{m} \alpha_i^0\, p_i(x)$$

for an appropriate set of proportions $\{\alpha_i^0\}_{i=1,\ldots,m}$. These proportions necessarily satisfy $\sum_{i=1}^{m} \alpha_i^0 = 1$ and $\alpha_i^0 \ge 0$, $i = 1, \ldots, m$. In this note, we also assume that each $\alpha_i^0$ is strictly positive.
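As an illustration of the mixture density just defined, the following is a minimal sketch for the univariate case (n = 1). The function names `normal_pdf` and `mixture_pdf` and the numerical values are hypothetical and chosen only for illustration; they do not come from the source.

```python
import math

def normal_pdf(x, mean, var):
    """Univariate normal density with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def mixture_pdf(x, alphas, means, variances):
    """Mixture density p(x) = sum_i alpha_i * p_i(x)."""
    # The note assumes strictly positive proportions that sum to one.
    if abs(sum(alphas) - 1.0) > 1e-12 or min(alphas) <= 0.0:
        raise ValueError("proportions must be strictly positive and sum to 1")
    return sum(a * normal_pdf(x, m, v) for a, m, v in zip(alphas, means, variances))

# Hypothetical two-component example (values are illustrative assumptions).
print(mixture_pdf(0.0, [0.3, 0.7], [0.0, 2.0], [1.0, 1.0]))
```

The multivariate case would replace the scalar variance by a covariance matrix and the squared deviation by the quadratic form in the density above.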

We address here the problem of numerically approximating the maximum-likelihood estimates of the parameters $\{\alpha_i^0, \mu_i^0, \Sigma_i^0\}_{i=1,\ldots,m}$ determined by samples of two types. Samples of both types consist of sets $\{x_{ik}\}_{k=1,\ldots,N_i}$ of independent observations on $\Pi_i$, $i = 0, \ldots, m$. (The sets $\{x_{ik}\}_{k=1,\ldots,N_i}$, $i = 1, \ldots, m$, comprise the identified observations of such samples, and such samples are said to be partially identified.) We distinguish samples of the two types according to whether the numbers $N_i$ of identified observations contain information about the proportions $\alpha_i^0$, $i = 1, \ldots, m$. If the numbers of identified observations contain no information about the proportions, then the sample is of the first type; otherwise, the sample is of the second type. The following are examples of how samples of the first and second types, respectively, might be obtained:


(1) For $i = 0, \ldots, m$, numbers $N_i$ are arbitrarily chosen and independent observations $\{x_{ik}\}_{k=1,\ldots,N_i}$ are obtained from $\Pi_i$.

(2) A number $K_0$ of observations are obtained from $\Pi_0$. For some $N_0 \le K_0$, $N_0$ of these observations are left unidentified, while the remaining $K_0 - N_0$ observations are identified. For $i = 1, \ldots, m$, a subset $\{x_{ik}\}_{k=1,\ldots,N_i}$ of the identified observations is determined whose member observations come from $\Pi_i$.
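The two sampling schemes can be simulated as follows. This is a rough sketch for univariate components; the function names and the use of Python's `random` module are illustrative assumptions, not part of the source.

```python
import random

def draw_mixture(rng, alphas, means, sds):
    """Return (label, x) for one observation on the mixture population."""
    i = rng.choices(range(len(alphas)), weights=alphas)[0]
    return i, rng.gauss(means[i], sds[i])

def sample_first_type(rng, alphas, means, sds, n_unidentified, n_identified):
    """Scheme (1): every count N_i is fixed in advance, so the identified
    counts say nothing about the proportions."""
    unidentified = [draw_mixture(rng, alphas, means, sds)[1]
                    for _ in range(n_unidentified)]
    identified = [[rng.gauss(means[i], sds[i]) for _ in range(n)]
                  for i, n in enumerate(n_identified)]
    return unidentified, identified

def sample_second_type(rng, alphas, means, sds, k_total, n_unidentified):
    """Scheme (2): K_0 draws from the mixture; N_0 of them lose their labels,
    and the rest are sorted by component, so the counts N_i are random."""
    draws = [draw_mixture(rng, alphas, means, sds) for _ in range(k_total)]
    unidentified = [x for _, x in draws[:n_unidentified]]
    identified = [[] for _ in alphas]
    for i, x in draws[n_unidentified:]:
        identified[i].append(x)
    return unidentified, identified
```

In the second scheme the identified counts follow a multinomial distribution governed by the proportions, which is why they carry information about them.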


In the following, we consider likelihood equations determined by the two types of samples which are necessary conditions for a maximum-likelihood estimate. These equations, which were derived by Coberly [1], suggest certain successive-approximations iterative procedures for obtaining maximum-likelihood estimates. These procedures, which are generalized steepest ascent (deflected gradient) procedures, contain those of Hosmer [2] as a special case. Using arguments that parallel those of [3], we show that, with probability 1 as $N_0$ approaches infinity (regardless of the relative sizes of $N_0$ and $N_i$, $i = 1, \ldots, m$), these procedures converge locally to the strongly consistent maximum-likelihood estimates* whenever the step-size is between 0 and 2. Furthermore, the value of the step-size which yields optimal local convergence rates is bounded from below by a number which always lies between 1 and 2.

2. Samples of the first type.

We first assume that numbers $\{N_i\}_{i=0,\ldots,m}$ are given and that, for $i = 0, \ldots, m$, $N_i$ independent observations $\{x_{ik}\}_{k=1,\ldots,N_i}$ are drawn on $\Pi_i$. The log-likelihood function for a sample of this type is

$$L_1(\Theta) = \sum_{i=1}^{m} \sum_{k=1}^{N_i} \log p_i(x_{ik}) + \sum_{k=1}^{N_0} \log p(x_{0k}).$$

In this expression, the parameter vector $\Theta$ (with components $\alpha_i$, $\mu_i$, $\Sigma_i$, $i = 1, \ldots, m$) belongs to the vector space defined in [3], and the density functions on the right-hand side are evaluated with the true parameter vector $\Theta^0$ (with components $\alpha_i^0$, $\mu_i^0$, $\Sigma_i^0$, $i = 1, \ldots, m$) replaced by $\Theta$.


*As in [3], one can show that, given any sufficiently small neighborhood of the true parameters, there is, with probability 1 as $N_0$ approaches infinity (regardless of the relative sizes of $N_0$ and $N_i$, $i = 1, \ldots, m$), a unique solution of the likelihood equations for either type of sample in that neighborhood, and this solution is a maximum-likelihood estimate.




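Differentiating $L_1(\Theta)$ and setting its partial derivatives to zero gives the likelihood equations

(1.a)  $\alpha_i = A_i(\Theta) \equiv \dfrac{\alpha_i}{N_0} \displaystyle\sum_{k=1}^{N_0} \dfrac{p_i(x_{0k})}{p(x_{0k})}$

(1.b)  $\mu_i = M_i(\Theta) \equiv \left\{ \displaystyle\sum_{k=1}^{N_i} x_{ik} + \sum_{k=1}^{N_0} x_{0k}\, \dfrac{\alpha_i p_i(x_{0k})}{p(x_{0k})} \right\} \Big/ \left\{ N_i + \displaystyle\sum_{k=1}^{N_0} \dfrac{\alpha_i p_i(x_{0k})}{p(x_{0k})} \right\}$

(1.c)  $\Sigma_i = S_i(\Theta) \equiv \left\{ \displaystyle\sum_{k=1}^{N_i} (x_{ik}-\mu_i)(x_{ik}-\mu_i)^T + \sum_{k=1}^{N_0} (x_{0k}-\mu_i)(x_{0k}-\mu_i)^T\, \dfrac{\alpha_i p_i(x_{0k})}{p(x_{0k})} \right\} \Big/ \left\{ N_i + \displaystyle\sum_{k=1}^{N_0} \dfrac{\alpha_i p_i(x_{0k})}{p(x_{0k})} \right\}$

for $i = 1, \ldots, m$.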

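We set

$$A(\Theta) = \begin{pmatrix} A_1(\Theta) \\ \vdots \\ A_m(\Theta) \end{pmatrix}, \qquad M(\Theta) = \begin{pmatrix} M_1(\Theta) \\ \vdots \\ M_m(\Theta) \end{pmatrix}, \qquad S(\Theta) = \begin{pmatrix} S_1(\Theta) \\ \vdots \\ S_m(\Theta) \end{pmatrix}$$

and define an operator $\Phi_\varepsilon$ on the vector space of [3] by

$$\Phi_\varepsilon(\Theta) = (1 - \varepsilon)\,\Theta + \varepsilon \begin{pmatrix} A(\Theta) \\ M(\Theta) \\ S(\Theta) \end{pmatrix}.$$

Clearly, for any non-zero $\varepsilon$, the likelihood equations are satisfied by a vector $\Theta$ if and only if $\Theta = \Phi_\varepsilon(\Theta)$.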

We consider the following iterative procedure: Beginning with some starting value $\Theta^{(1)}$, define successive iterates inductively by

(2)  $\Theta^{(j+1)} = \Phi_\varepsilon(\Theta^{(j)})$

for $j = 1, 2, 3, \ldots$. Our local convergence result for this iterative procedure, as stated in the introduction, follows immediately from the theorem below.
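One step of an iteration of this kind can be sketched in code for the univariate case. The sketch assumes the maps $A_i$, $M_i$, $S_i$ have the following form: $A_i$ is the average over the unidentified observations of the weights $\alpha_i p_i(x_{0k})/p(x_{0k})$, while $M_i$ and $S_i$ are weighted means and scatters that pool the identified observations (weight 1) with the unidentified ones (same weights). That form is a reading of a badly degraded source and should be treated as an assumption; the function names are hypothetical.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def phi_step(theta, unidentified, identified, eps):
    """One application of Phi_eps(theta) = (1 - eps)*theta + eps*(A, M, S)(theta).

    theta is (alphas, means, variances); identified[i] holds the labelled
    observations from component i. Univariate sketch under the assumed
    form of A_i, M_i, S_i described in the lead-in.
    """
    alphas, means, variances = theta
    n0 = len(unidentified)
    # weights[k][i] = alpha_i p_i(x_0k) / p(x_0k) for each unidentified point.
    weights = []
    for x in unidentified:
        parts = [a * normal_pdf(x, mu, v) for a, mu, v in zip(alphas, means, variances)]
        total = sum(parts)
        weights.append([p / total for p in parts])
    new_a, new_m, new_s = [], [], []
    for i in range(len(alphas)):
        w = [row[i] for row in weights]
        denom = len(identified[i]) + sum(w)
        a_i = sum(w) / n0
        m_i = (sum(identified[i])
               + sum(wk * x for wk, x in zip(w, unidentified))) / denom
        # The scatter is taken about the current mean, not the updated one.
        s_i = (sum((x - means[i]) ** 2 for x in identified[i])
               + sum(wk * (x - means[i]) ** 2 for wk, x in zip(w, unidentified))) / denom
        new_a.append((1 - eps) * alphas[i] + eps * a_i)
        new_m.append((1 - eps) * means[i] + eps * m_i)
        new_s.append((1 - eps) * variances[i] + eps * s_i)
    return new_a, new_m, new_s
```

Calling `phi_step` repeatedly from a starting value gives the successive iterates; with `eps = 1.0` each step simply replaces the parameters by the values of the three maps.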

Theorem 1: With probability 1 as $N_0$ approaches infinity, $\Phi_\varepsilon$ is a locally contractive operator (in some norm on the vector space of [3]) near the strongly consistent maximum-likelihood estimate whenever $0 < \varepsilon < 2$.

In saying that $\Phi_\varepsilon$ is a locally contractive operator near a point $\Theta$, we mean that there is a vector norm $\|\cdot\|$ on that space and a number $\lambda$, $0 \le \lambda < 1$, such that

$$\|\Phi_\varepsilon(\Theta') - \Theta\| \le \lambda\, \|\Theta' - \Theta\|$$

whenever $\Theta'$ lies sufficiently near $\Theta$.

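Proof of Theorem 1: Let $\Theta$ (with components $\alpha_i$, $\mu_i$, $\Sigma_i$, $i = 1, \ldots, m$) be the strongly consistent maximum-likelihood estimate. We assume that $\alpha_i \ne 0$, $i = 1, \ldots, m$. (As $N_0$ approaches infinity, the probability is 1 that this is the case.) As in [3], it suffices to show that, with probability 1, $\nabla\Phi_\varepsilon(\Theta)$ converges to an operator which has operator norm less than 1 with respect to a suitable vector norm on the vector space of [3]. Now

$$\nabla\Phi_\varepsilon(\Theta) = (1 - \varepsilon)\, I + \varepsilon\, \nabla \begin{pmatrix} A(\Theta) \\ M(\Theta) \\ S(\Theta) \end{pmatrix},$$

and we write

$$\nabla \begin{pmatrix} A \\ M \\ S \end{pmatrix} = \begin{pmatrix} \nabla_\alpha A & \nabla_\mu A & \nabla_\Sigma A \\ \nabla_\alpha M & \nabla_\mu M & \nabla_\Sigma M \\ \nabla_\alpha S & \nabla_\mu S & \nabla_\Sigma S \end{pmatrix}.$$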

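Define inner products $\langle\cdot,\cdot\rangle_1$, $\langle\cdot,\cdot\rangle_2$, and $\langle\cdot,\cdot\rangle_3$ on the respective component spaces as in [3].

Setting

$$\beta_i(x) = \frac{p_i(x)}{p(x)}, \qquad \gamma_i(x) = \Sigma_i^{-1}(x - \mu_i),$$

and $\delta_i(x)$ equal to an expression in $(x - \mu_i)(x - \mu_i)^T$ and $\Sigma_i$ [not fully legible in the source], for $i = 1, \ldots, m$, one calculates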


[The displayed formulas for the nine partial Fréchet derivatives $\nabla_\alpha A(\Theta)$, $\nabla_\mu A(\Theta)$, $\nabla_\Sigma A(\Theta)$, $\nabla_\alpha M(\Theta)$, $\nabla_\mu M(\Theta)$, $\nabla_\Sigma M(\Theta)$, $\nabla_\alpha S(\Theta)$, $\nabla_\mu S(\Theta)$, and $\nabla_\Sigma S(\Theta)$, expressed in terms of $\beta_i$, $\gamma_i$, $\delta_i$, the proportions, and the sample sizes $N_0$ and $N_i$, are illegible in the source.]

Here, the arguments of $\beta_i$, $\gamma_i$ and $\delta_i$ can be determined from the indices of summation.


Setting [...]

We have assumed that $\Theta$ is the strongly consistent maximum-likelihood estimate. Then, regardless of the relative sizes of $N_0$ and $N_i$, one can show as in [1] that, with probability 1,

$$\nabla \begin{pmatrix} A(\Theta) \\ M(\Theta) \\ S(\Theta) \end{pmatrix} - E\left( \nabla \begin{pmatrix} A(\Theta^0) \\ M(\Theta^0) \\ S(\Theta^0) \end{pmatrix} \right)$$

converges to zero as $N_0$ approaches infinity. Now

$$E\left( \nabla \begin{pmatrix} A(\Theta^0) \\ M(\Theta^0) \\ S(\Theta^0) \end{pmatrix} \right) = B(I - QR),$$

where $B$ and $Q$ are block-diagonal operators [their displayed entries are illegible in the source] and

$$R = \int V(x)\, \langle V(x), \cdot \rangle\, p(x)\, dx.$$
It was shown in [3] that $QR$ is positive-definite and symmetric with operator norm less than 1 with respect to the inner product $\langle\cdot, Q^{-1}\cdot\rangle$ on the vector space of [3]. It follows that $I - QR$ is positive-definite and symmetric with norm less than 1 with respect to $\langle\cdot, Q^{-1}\cdot\rangle$. Since $B$ and $Q$ commute, $\langle\cdot, Q^{-1}B^{-1}\cdot\rangle$ is an inner product on that space, and one sees that $\langle W, Q^{-1}W\rangle \le \langle W, Q^{-1}B^{-1}W\rangle$ for every $W$ in that space. Consequently, $B(I - QR)$ is positive-definite and symmetric with norm less than 1 with respect to the inner product $\langle\cdot, Q^{-1}B^{-1}\cdot\rangle$. One concludes that

$$E(\nabla\Phi_\varepsilon(\Theta^0)) = (1 - \varepsilon)\, I + \varepsilon\, E\left( \nabla \begin{pmatrix} A(\Theta^0) \\ M(\Theta^0) \\ S(\Theta^0) \end{pmatrix} \right)$$

has norm less than 1 with respect to $\langle\cdot, Q^{-1}B^{-1}\cdot\rangle$ whenever $0 < \varepsilon < 2$. This completes the proof of the theorem.

We remark that, reasoning as in [3], one may determine a particular value of $\varepsilon$ (the "optimal $\varepsilon$") which yields, with probability 1 as $N_0$ approaches infinity, the fastest asymptotic uniform rates of local convergence of the iterative procedure (2) near $\Theta$. This optimal $\varepsilon$ is given by

$$\varepsilon = \frac{2}{2 - (\tau + \rho)},$$

where $\rho$ and $\tau$ are, respectively, the largest and smallest eigenvalues of $B(I - QR)$ regarded as an operator on the parameter space with its $\mathbb{R}^m$ factor replaced by the subspace of $\mathbb{R}^m$ whose components sum to zero. Since $\rho$ and $\tau$ lie between zero and 1, one sees that the optimal $\varepsilon$ is always greater than 1. If the component populations are "widely separated," then $\rho$ and $\tau$ are near zero and, hence, the optimal $\varepsilon$ is near 1. If two or more of the component populations are nearly indistinguishable and if $N_0$ is large relative to the $N_i$'s, then $\tau$ is near zero, and the optimal $\varepsilon$ cannot be much smaller than 2.
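The step-size formula can be checked numerically. The sketch below assumes, as the proof indicates, that the limiting derivative is $(1-\varepsilon)I + \varepsilon B(I-QR)$ with $B(I-QR)$ symmetric in a suitable inner product, so that its norm is the largest absolute eigenvalue and the extremes occur at $\tau$ and $\rho$. The function names are hypothetical.

```python
def optimal_step(tau, rho):
    """Step-size 2 / (2 - (tau + rho)) for eigenvalue bounds 0 <= tau <= rho < 1."""
    if not 0.0 <= tau <= rho < 1.0:
        raise ValueError("expected 0 <= tau <= rho < 1")
    return 2.0 / (2.0 - (tau + rho))

def contraction_factor(eps, tau, rho):
    """Largest |(1 - eps) + eps * lam| over lam in [tau, rho].

    The expression is linear in lam, so the maximum of its absolute value
    is attained at one of the two endpoints.
    """
    return max(abs(1.0 - eps + eps * tau), abs(1.0 - eps + eps * rho))

# Illustrative values only: eigenvalues of B(I - QR) spread over [0, 0.8].
eps_opt = optimal_step(0.0, 0.8)
print(eps_opt, contraction_factor(eps_opt, 0.0, 0.8), contraction_factor(1.0, 0.0, 0.8))
```

At the optimal step the two endpoint values have equal magnitude, $(\rho - \tau)/(2 - \tau - \rho)$, which is why no other step-size gives a smaller factor under these assumptions.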


3. Samples of the second type.

We now assume that $K_0$ observations are obtained from the mixture population $\Pi_0$, and that, for some $N_0 \le K_0$, $N_0$ of these observations are left unidentified, while the remaining $K_0 - N_0$ observations are identified. For $i = 1, \ldots, m$, let $\{x_{ik}\}_{k=1,\ldots,N_i}$ denote the subset of the identified observations which come from $\Pi_i$, and let $\{x_{0k}\}_{k=1,\ldots,N_0}$ be the set of unidentified observations from $\Pi_0$. The log-likelihood function for this sample is

$$L_2(\Theta) = \log\left[ \frac{\left(\sum_{i=1}^{m} N_i\right)!}{\prod_{i=1}^{m} N_i!} \prod_{i=1}^{m} \alpha_i^{N_i} \right] + \sum_{i=1}^{m} \sum_{k=1}^{N_i} \log p_i(x_{ik}) + \sum_{k=1}^{N_0} \log p(x_{0k}).$$


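Differentiating $L_2$ and setting its partial derivatives to zero gives the likelihood equations

(3.a)  $\alpha_i = A_i(\Theta) \equiv \dfrac{1}{K_0} \left\{ N_i + \displaystyle\sum_{k=1}^{N_0} \dfrac{\alpha_i p_i(x_{0k})}{p(x_{0k})} \right\}$

(3.b)  $\mu_i = M_i(\Theta)$

(3.c)  $\Sigma_i = S_i(\Theta)$

for $i = 1, \ldots, m$, where $K_0 = N_0 + \sum_{i=1}^{m} N_i$ and $M_i$ and $S_i$ are as in (1.b) and (1.c). (The right-hand side of (3.a) is illegible in the source; the form given here is the one obtained by differentiating $L_2$ subject to $\sum_{i=1}^{m} \alpha_i = 1$.)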
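We set

$$A(\Theta) = \begin{pmatrix} A_1(\Theta) \\ \vdots \\ A_m(\Theta) \end{pmatrix}$$

and define an operator $\Phi_\varepsilon$ on the vector space of [3] by

$$\Phi_\varepsilon(\Theta) = (1 - \varepsilon)\,\Theta + \varepsilon \begin{pmatrix} A(\Theta) \\ M(\Theta) \\ S(\Theta) \end{pmatrix}.$$

Our iterative procedure is the following: Beginning with some starting value $\Theta^{(1)}$, define successive iterates inductively by

(4)  $\Theta^{(j+1)} = \Phi_\varepsilon(\Theta^{(j)})$

for $j = 1, 2, 3, \ldots$. As before, the desired local convergence result for this iterative procedure follows from the theorem below.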
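For samples of the second type, the counts of identified observations enter the update of the proportions. The following univariate sketch assumes the proportion map is $A_i(\Theta) = \{N_i + \sum_k \alpha_i p_i(x_{0k})/p(x_{0k})\}/K_0$ with $K_0 = N_0 + \sum_i N_i$; this is what differentiating the second-type log-likelihood under the constraint $\sum_i \alpha_i = 1$ yields, but the corresponding display in the source is illegible, so the form is an assumption. The function names are hypothetical.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def proportions_map_second_type(alphas, means, variances, unidentified, n_identified):
    """Assumed form: A_i = (N_i + sum_k alpha_i p_i(x_0k) / p(x_0k)) / K_0."""
    k0 = len(unidentified) + sum(n_identified)
    soft_counts = [0.0] * len(alphas)
    for x in unidentified:
        parts = [a * normal_pdf(x, mu, v) for a, mu, v in zip(alphas, means, variances)]
        total = sum(parts)
        for i, p in enumerate(parts):
            soft_counts[i] += p / total  # fractional count for component i
    # Identified observations contribute whole counts N_i.
    return [(n + c) / k0 for n, c in zip(n_identified, soft_counts)]

def relaxed_proportions(alphas, mapped, eps):
    """Proportion components of (1 - eps) * theta + eps * A(theta)."""
    return [(1 - eps) * a + eps * b for a, b in zip(alphas, mapped)]
```

With no unidentified observations the map returns the observed fractions $N_i/K_0$, and with no identified observations it reduces to the average of the weights over the unidentified sample.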


Theorem 2: With probability 1 as $N_0$ approaches infinity, $\Phi_\varepsilon$ is a locally contractive operator (in some norm on the vector space of [3]) near the strongly consistent maximum-likelihood estimate whenever $0 < \varepsilon < 2$.

Proof of Theorem 2: If $\Theta$ is the strongly consistent maximum-likelihood estimate, then, as before, it suffices to show that, with probability 1, $\nabla\Phi_\varepsilon(\Theta)$ converges as $N_0$ approaches infinity to an operator which has operator norm less than 1 with respect to some vector norm on the vector space of [3].

Proceeding as before, one sees that [...]. The remaining Fréchet derivatives, i.e., the derivatives at $\Theta$ of $M$ and $S$ with respect to $\alpha$, $\mu$, and $\Sigma$, are unchanged, except that [...] must be replaced by [...] wherever it appears.

One obtains at $\Theta$ [...] previously, except that each [...] in the latter is replaced by the former. One verifies that, with probability 1 as $N_0$ approaches infinity, the derivative of $(A, M, S)$ at $\Theta$ has the same limit as $B(I - QR)$, where $Q$ and $R$ are as before and $B$ is a scalar multiple of $I$ (the scalar is illegible in the source). Repeating our earlier reasoning, one verifies that $B(I - QR)$ is positive-definite and symmetric with norm less than 1 with respect to the inner product $\langle\cdot, Q^{-1}B^{-1}\cdot\rangle$. Hence

$$\nabla\Phi_\varepsilon(\Theta) = (1 - \varepsilon)\, I + \varepsilon\, \nabla \begin{pmatrix} A(\Theta) \\ M(\Theta) \\ S(\Theta) \end{pmatrix}$$

converges to an operator which has norm less than 1 with respect to $\langle\cdot, Q^{-1}B^{-1}\cdot\rangle$ whenever $0 < \varepsilon < 2$. This completes the proof of the theorem.

The remarks concerning the "optimal $\varepsilon$" at the conclusion of the preceding section are valid here verbatim.